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Abstract—Some_ theoretical and experimental models have been 
considered for the prediction of the path loss in mobile communications 
systems. However, one knows that in real environment, the received signal 
is subject to variations. The model developed for an urban area cannot 
give resulted acceptable for different urban areas since that, each model 
has different parameters in accordance with the considered area. This 
paper presents the results of propagation channel modeling, based on 
multivariate time series models using data collected in measurement 
campaigns and the main characteristics of urbanization in the city of 
Belem-PA. Transfer function models were used to evaluate effects on the 
time series of received signal strength (dBm) which was used as the 
response variable and as explanatory variables of the height of buildings 
and distances between buildings. As time series models disregard to the 
possible correlations between neighboring samples, we used a 
geostatistical model to establish the correctness of this model error. The 
results obtained with the proposed model showed a good performance 
compared to the measured signal, considering the data of the eleven 
routes from the center of the city of Belém/Pa. From the map of the spatial 
distribution of the received signal strength (dBm), one can easily identify 
areas below or above dimensional in terms of this variable, that is 
benefited or damaged compared with the signal reception, which may 
result in a greater investment of the local operator (concessionaire mobile 


phone) in those regions where the signal is weak. 





I. INTRODUCTION 


Nowadays there is a great variety of communications 
theories and 


channel models, with fundamental 


experiments with a prediction on path loss in mobile 
communication systems. These models differ in their 
applicability, on different types of terrain and different 
environmental conditions. Thus, there is not an existing 
appropriate model for all situations. In real cases, the 


terrain on which 
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the propagation presents 


topography, vegetation and constructions are randomly 
distributed. Although the propagation loss calculation can 
be performed, although with limited accuracy, using 
techniques such as ray tracing or numerical solutions for 
approximations of the wave equation. 


The propagations models are generally based on the 
deterministic models (Liaskos et al., 2018; Salous, 2013; 
Shu Sun et al., 2014)[1-3] and modified based on results 


varied obtained from measurement campaigns in one or more 
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regions [2]. The models obtained are given by expressions 
that provide the median value of attenuation, like the 
models of Okumura-Hata(Arthur et al., 2019) [4]consisting 
analytical expressions of the average attenuation route, for 
urban areas, suburban and open (rural). These formulations 
are limited to certain ranges of input parameters, and are 
applicable only to land almost flat and are valid for 
frequencies of 150 to 1500 Mhz. and the model of 
Ibrahim-Parsons(Rozal et al., 2012) [5], which takes into 
account factors such as the degree of urbanization, land 
usage, and the variation in height between the mobile 
station (MS) and the base transceiver station (BTS). These 
empirical characteristics were extracted from 
measurements taken in the city of London, on frequencies 
between 168 and 900 MHz. This model was studied in 
urban areas without undulations. It is used for distances 
between antennas smaller than 10 km and receiving 


antenna height of less than 3 m. 


The model of Walfisch—Ikegami(Alqudah, 2013) [6]has its 
formulation based on characteristics of urban regions, such 
as density and average height of buildings, and the width 
of the streets. This model is effective in cases where the 
height of the antennas BTS is smaller than the average 
height of buildings a situation where there is considerable 
guidance signal RF along the routes considered. This 
model predicts two different situations for calculating the 
average attenuation path between BTS and the mobile: The 
line of sight (LOS— line of sight and Non-line-of-sight 
(NLOS). 


This paper presents a model for time series to characterize 
the received signal strength (dBm) in eleven pathways 
downtown of Belém/PA. The work consisted in the study 
of the possible relationship between this received signal 
strength and the behavior of the height of the buildings and 
the distance between. Transfer function models were used 
to assess effects on time series of the received strength and 
to evaluate the relationship between the height of the 
buildings and the distance between buildings. 


For error correction model in time series, instead of using 
another ARIMA model, a spatial geostatistical model 
based on kriging was used. This module includes a set of 
required procedures for geostatistical techniques 
(exploratory analysis, generation and modeling of a 
semivariogram and kriging). With the objective an analysis 
in two dimensions for spatially distributed data, with 
respect to interpolation of surfaces generated from the geo- 
referenced samples obtained from the received strength. 


IH. RELATED WORKS 


The literature analysis of propagation models has 
investigated different statistical prediction methods to 
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identify appropriate techniques for thispurpose. Currently, 
many propagation channel models employ the most varied 
modeling techniques, such as time series modeling and 
geostatistics. In(Konak, 2010)[7] estimated signal 
propagation losses in wireless LANs using Ordinary 
Kriging (OK). In (Phillips et al., 2012) [8] used OK on a 
2.5 GHz WiMax network to produce radio environment 
maps that are more accurate and informative than 
deterministic propagation models. In(Kolyaie et al., 
2011)[9] used drive-tests to collect signal strength 
measurements and compared the performance of empirical 
and spatial interpolation techniques.(Y. Zhang et al., 
2012)[10] developed a methodology based on time series 
analysis and geostatistics through experiments using a real 
dataset from the Swiss Alps. The results showed that the 
developed methodology accurately detected outliers in 
wireless sensor network (WSN) data, by taking advantage 
of their spatial and temporal correlations. Edilberto Rozal 
et al. [5] presented results of propagation channel 
modeling, based on multivariate time series models and 
the main characteristics of urbanization in the city of 
Belém/PA by using data collected in measurement 
campaigns. Transfer function models were used to 
evaluate the relationship between the received signal 
strength and other variables, such as building’s height, 
distance between buildings, and distance to the radio base 
station, which were recorded in a street in the city center of 
Belém/PA, Brazil.(Karunathilake et al., 2014)[11] studied 
location-based systems to investigate the availability of 
signal reception levels, specifically 3G and 4G signals. 
The study was based on geostatistical analysis using the 
inverse distance weighting (IDW) method.(Molinari et al., 
2015)[12] empirically studied the accuracy of a wide range 
of spatial interpolation techniques, including various forms 
of Kriging, in different scenarios that captured the unique 
characteristics of sparse and non-uniform measurements 
and measurements in imprecise locations. The results 
obtained indicated that ordinary Kriging was an overall 
fairly robust technique in all scenarios.(Wen-jing et al., 
2017)[13] proposed a traffic prediction method based on 
the seasonal autoregressive integrated moving average (S- 
ARIMA) model, according to the characteristics of the 
network traffic and its respective implementation.(K. 
Zhang et al., 2019)[14] proposed a system for traffic 
analysis and prediction suitable for urban wireless 
communication networks, which combined actual call 
detail record (CDR) data analysis and multivariate 
prediction algorithms.(Mezhoud et al., 2020) [15]proposed 
an approach for coverage prediction based on the 
hybridization of the interpolation technique by OK and a 
Neural Network with MLP-NN architecture, this 
methodology was motivated by the lack of quality of the 
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MLP-NN test database, which satisfactorily enriched the 
network's training dataset.(Song et al., 2020)[16] used a 
novel secure data aggregation solution based on the 
ARIMA model to prevent tracking of private data by 
opponents.(Faruk et al., 2019)[17]evaluated and analyzed 
the efficiencies of empirical, heuristic and geospatial 
methods for predicting signal fading in the very high 
frequency (VHF) and ultra-high frequency (UHF) bands in 
typically urban environments. Path loss models based on 
artificial neural network (ANN), adaptive neuro-fuzzy 
inference system (ANFIS) and Kriging techniques were 
developed. Sato et al.(Sato et al., 2021)[18]proposed a 
technique that interpolates the representative map of the 
mobile radio signal in the spatial domain and in the 
frequency domain. 


Hl. TIME SERIES 


A time series is a set of statistics, usually collected at 
regular intervals.Time series data occur naturally in many 
application areas, such as _ economics, finance, 
environmental and medicine. The methods of time series 
analysis pre-date those for general stochastic processes and 
Markov Chains. The aims of time series analysis are to 
describe and summarize time series data, fit low- 


dimensional models, and make forecasts [5]. 


We write our real-valued series of observations as 
...X_2,X_1,X9,X1,X2, .. 
real-valued random variables indexed by integers numbers. 


., a doubly infinite sequence of 


One simple method of describing a series is that of 
classical decomposition. The notion is that the series can 
be decomposed into four elements: 


Trend (T;) — long term movements in the mean; 


Seasonal effects (I) — cyclical fluctuations related to the 
calendar; 


Cycles (C+) — other cyclical fluctuations (such as a 
business cycles); 


Residuals (E+) — other random or systematic fluctuations. 


The idea is to create separate models for these four 
elements and then combine them, either additively: 


Xe= [etle Ce- E; (1) 
or multiplicatively: 
3.1. ARIMA Models 


Box and Jenkins [5] first introduced ARIMA models, the 
term deriving from: AR = Autorregressive, I = Integrated 
and MA = Moving average. 
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A key concept underlying time series processes is that of 
stationarity. A time series is stationarity when it has the 
following three characteristics: 


(a) Exhibits mean reversion in that it fluctuates around a 
constant long-run mean; 


(b) Has a finite variance that is time-invariant; 


(c) Has a theoretical correlogram that diminishes as the lag 
length increases. 


The autoregressive process of order p is denoted AR(p), 
and defined by 

Y, = Dia Pi Ye-i + Ct (3) 

Where @,,...,9, are fixed constants. Y, is expressed 
linearly in terms of current and previous values of a white 
noise series {e+}. This noise series is constructed from the 
forecasting errors; {e+} is a sequence of independent (or 


uncor-related) random variables with mean O and variance 


o2. 


Using the lag operator L (the lag operator L has the 
property: (L”Y, = Y,_,,) we can write the AR(p) model as: 


¥,(1 — pL — @2L*—...—PpL?) = €; (4) 
P(L)Y, = ee (5) 
Where @(L)Y,is a polynomial function of Y,. 


The moving average process of order q is denoted MA(q) 
and defined by: 


Y, =e, + a O; ei (6) 
Where,@,,..., Oq are fixed constants, Oo = 1, and {e+} is a 


sequence of independent (or uncorrelated) random 
variables with mean 0 and variance 0°. 


Or using the lag operator: 
Y; = (1 = 0L — 6,L7-. a ik —0, LI Jut (7) 
Y, = O(L)u, (8) 


The combination of the two processes to give a new series 
of models called ARMA (p, g) models, is defined by 


Ye = Diner Pitti + Ce + Die 9) etj (9) 
Where again {e;} is white noise, {g;/i = 1,2,.., p}are the 


coefficients of AR model and 6;/i = 1,2,..,q} are the 
coefficients of MA model. 


Using the lag operator: 
6,L7-... —6,L7) (10) 
P(L)Y, = O(L)e (11) 


According to the target model, the process is non- 
stationary, so the series should be transformed to a 
stationary process be the model construction. This can be 
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often achieved by a differentiation process.The first-order 
differencing of the original time series is defined as: 


AY, = Y; az Y; = Y; = BY, (12) 

For the high-order differentiation, we have: 

AĉtY, = (1 — B)*Y, (13) 

If we ever find that the differenced process is a stationary 
process, we can look for a ARMA model of that. The 
process {Y,} is said to be an autoregressive integrated 
moving average process, ARIMA(p,d,q). If X, = A*Y, is 
an ARMA (p, q) process. 

After the d-order differentiations of Y,in equation 10, the 


autoregressive integrated moving average (ARIMA), 
ARIMA (p,d,q), can be constructed as: 


P(L)YE = O(L)e; (14) 
A time series (TS) may be defined as a set of observations 
Y, as a function of time [5]. The principal tools utilized for 


analysis of a time series are the autocorrelation and partial 
autocorrelation functions. 


The autocorrelation function (ACF) represents a simple 
correlation between Y;and Y,_,as a function of the lag k. 
The autocorrelation function of TS {Y;} may be defined as: 
[5]. 

_ Do -¥) Yte-Y) 

ley) 

Where N represents the length of the TS and Yis the 
expected value from the observations, calculated for the 


p (15) 


time variation (delay) k. The autocorrelation coefficient (p) 
of a TS varies between —1 and 1. 


The partial autocorrelation function (PACF) represents the 
correlation between Y;and Y;_,as a function of the lag k, 
filtering the effect of the other lags on Y,and Y;_,. The 
partial autocorrelation function is defined as the sequence 
of correlations between (Y, and Y;_1), (Y; and Y;_2), (Y; 
and Y;_3) and so on, because the effects of prior lag on t 
remain constant. The PACF is calculated as the coefficient 
value ~;, 1n the equation: 


Ye = Pr1Yt-1 + Pr2Yt-2 + Pr3Yt-3 t... FO Men t et 
(16) 


3.2. Transfer Function Model 


Transfer function model is different from ARIMA model. 
ARIMA model is univariate time series model, but transfer 
function is multivariate time series model. This means that 
ARIMA model relates the series only to its past. Besides 
the past series, transfer function model also relates the 
series to other time series. Transfer function models can be 
used to model single-output and multiple-output systems 
[5]. In the case of single-output model, only one equation 
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is required to describe the system. It is referred to as a 
single-equation transfer function model. A multiple-output 
transfer function model is referred to as a multi-equation 
transfer function model or a simultaneous transfer function 
(STF) model [5]. 


Assume that X;and Y, are properly transformed series such 
that both are stationary. In a linear system with simple 
input and output, the series of X, input and Y, output are 
related through a linear filter as 


Where v(B) = X} Zo v;BÍ is referred to as a filter transfer 


function by Box and Jenkins and N; is a noise series of the 
system that is independent of the input series X; 


The coefficients in the transfer function model (17) are 
often called the impulse response weights. 


The objective of modeling the transfer function is to 
identify and estimate the transfer function WB) and the 
noise model for N, based on the information available for 
the input series X, and the output series Y,. The greatest 
difficulty is that information regarding X;,andY, is finite, 
and the transfer function in (17) contains an infinite 
number of coefficients. To alleviate this difficulty, the 
transfer function WB) is shown in the following rational 
form: [5] 


__ Ws(B)B? 
ara T (18) 
Wherew,(B) = Wọ —W,B-...—WB*, 6,(B) =1-—- 
6,B—...—6,B", and b is a lag parameter that represents 


the delay that elapses before the impulse of the input 
variable produces an effect on the output variable. For a 
stable system, it is assumed that the roots of 6,(B) = 0 lie 
outside the unit circle [5]. After obtainingw,(B), 6,(B) 
and b, the v; weights of the impulse response can be 


obtained by setting the coefficients of B/on both sides of 
the equation equal to one another: 


6,(B) v(B) = w,(B) B” (19) 


In practice, the values of r and s on the system (8) rarely 
exceed 2. Some transfer functions can be seen in [5]. 
These models may be used to identify the parameters of 
the transfer function. Analysis of these models show that 
the occurrence of peaks suggests parameters in the 
numerator of the transfer function, similar to models of 
moving averages, and the occurrence of an exponential 
decay behavior may indicate the existence of parameters in 
the denominator of the transfer function, similar to the 
autoregression models. 
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IV. GEOSTATISTICS 


Geostatistics is used in the spatial interpolation and 
uncertainty quantification for variables that exhibit spatial 
continuity, i.e, can be measured at any point of the area / 
region / area under study. Using traditional statistical 
concepts as random variable (VA) cumulative distribution 
(FDA), probability 
(PDF),expected value, variance, etc. These concepts can 


function density function 
be found in statistical textbooks. In geostatistics, the VA, 
represented by z(u), where u is the vector of coordinates 
of the location, is related to some location in space. In this 
case, the main statistics are set out below. The cumulative 
distribution function (FDA) gives the probability that the 
VA Z is less than or equal to a certain value z, generally 
called cutoff value(Chilés & Delfiner, 2012; Gooverts, 
1984; Isaaks, 1990; Johnston et al., 2001; Pyrcz & 
Deutsch, 2014; Shiquan Sun et al., 2020; Tobler, 
1989)[19-25] 


4.1 Description of Spatial Patterns 


In earth science is often important to know the pattern of 
dependence of one variable X over another Y.The joint 
distribution of results of a pair of random variables Xand 
Yis characterized by the FDA joint (or bivariate) defined 
as: 


Fyy (x,y) = prob{X <x; Y < y} (20) 


estimated in practice the proportion of data pairs below the 
respective joint values (cutoff values) x and y. This can be 
shown in the scatter diagram (Fig. 1) in which each pair of 
data (xi,yi)1s plotted as a point. 


The degree of dependence between the two variables 

X and Y can be characterized by the dispersion around 45° 

in the scattergram. The great reliance (X = Y) matches all 

experimental pairs (x;yi), i = 1,..., N plotted on the line 45°. 
y 






0 . 
45 line 


| x@ - y@ | 


Fig. 1: Pair (x; ,yi) on a scattergram 
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The moment of inertia of the scattergram around the 45° 
line — called "semivariogram" for all pairs (xi,yi) — is 
defined as half the average of squared differences between 
the coordinates of each pair, 1.e.: 


1 1 
Vay = Lier A? = Lisi — Vi)? (21) 


The higher the value of the semivariogram, the greater 


dispersion and less closely related are the two variables 
X and Y. 


In problems of spatial interpolation, where one want to 
infer (map) a certain area for a given property, z(u), ue 
area A, starting from a sample n of z(u). The combination 
of all nh) pairs of data ofz(u), over the same 
area/zone/layer/population A with such pairs separated by 
approximately the same vector h (in length and direction), 
allows estimating the semivariogram characteristic (or 
experimental) of the spatial variability in A: 


y(h) = — IN [z(ua) — Z(ua + h)]? 


a=1 


1 
2N(h) 





(22) 


An experimental semivariogram (22) is an estimate of an 
integral discrete space defining a well determined on 
averageA: 


1 


A ares filz) — z(u +h)ļ?’ du foru, uthea. 
(23) 


Such as a VA z(u) is and its distribution characterizes the 
uncertainty about the value of certain property located at 
u, a random function z(u), u € A, defined as a set of VA’s 
dependent feature of joint spatial uncertainty about A. The 
semivariogram of this random function characterizes the 
degree of spatial dependence between two random 
variables z(u) and z(u + h) separated from the vector h. 


For the modeling of the semivariogram conducted after 
building the experimental semivariogram, it is necessary 
that the hypothesis is considered stationary. This 
hypothesis states, in summary, that the first two moments 
(mean and variance) of the difference [z(w) — z(u + h)| 
are independent of location “ and function only for the 
vector h. The second moment of this difference 
corresponds to the semivariogram, 1.e: 


2y(h) = E{[z(u) — z(u + h)]*}is independent to u € 
A.(24) 
Developing the equation above (adding m° to all terms for 


convenience), one obtains: 


2y(h) = C(0) =- C(h), (25) 


and that: 
Var{Z(u)} = Var{Z(u + h) = o? = C(0) for all we 
A. (26) 
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Cov{Z(u), {Z(ut+ h) = C(h)forallu € A (27) 


The relation (25) is then utilized to determine the 
semivariographic model. The variance C(O) is called in 
geostatistics a baseline (or sill). The semivariogram can be 
defined as the graph of the semivariance function versus 
distanceh,is a technique used to measure the dependence 
between sample points, distributed according to a spatial 
reference and for interpolation of values required for the 
construction of isoline maps [19]. According to 
Christakos(Christakos, 1984)[26], is the preferred tool for 
statistical inference because it offers some advantages over 
the covariance, including: 


1) Its empirical calculation is subject to minor errors; 


ii) Provides a better characterization of the spatial 
variability; 
111) Requires the called intrinsic stationarity assumption, 
i.e. that z(u) is a random function with stationary 
increments z(u + h)— z(u), but not necessarily itself 
stationary. 


The semivariogram is the preferred tool for statistical 
inference because it offers some advantages over the 
covariance [19]. For a continuous function is selected a 
semivariogram necessary to satisfy the property of positive 
definite. In practice are used linear combinations in basic 
models that are valid, i.e., permissible. One of the most 
used basic models in geostatistics is the spherical model, 


given 
by: 
3 /\h 1 (\hl\3 
Vac (=) -+(*) 0< lhl <a (28) 
C |h| >a 


The components C anda are denominated the level and 
range, respectively. The level, also known as "sill" 
represents the variability of the semivariogram to its 
stabilization. The range (or variogram range) and the 
distance are observed up to the level where the variability 
stabilizes. Indicates the distance in which the samples are 
spatially correlated (Fig. 2). 
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Range (a) 


Semivariogram 
Sill (C) 


Í Nugget effect C3 


Distance (h) 


Fig. 2: Parameters of the semivariogram 


4.3 Ordinary Kriging 


Kriging is a interpolation technique in which the 
surrounding measured values are weighted to derive a 
predicted value for an unmeasured location. Weights are 
based on the distance between the measured points, the 
prediction locations, and the overall spatial arrangement 
among the measured points. Kriging is based on 
regionalized variable theory, which assumes that the 
spatial variation in the data being modeled is homogeneous 
across the surface. Ordinary Kriging (OK) considers the 
local variation of the mean limited to the domain of 
stationary of the average local neighborhood W (u) 
centered on the location u to be estimated [24-25]. In this 
case, one considers the common average (stationary) m(u) 
in equation 43, e.1.: 


Z*(u) = DOA, (u)z(ug) + [1 - 


a=1 
Latr Ag (u)]m(u) (29) 
The mean m(u) unknown can be eliminated by 


considering the sum of the weightsA, (wu) of kriging equal 
to 1. This mode: 
Zkos (U) = Zac [AK0 (u)z(uq), with DELP AL? (u) = 
1(30) 
The minimization of the error variance (Var|Z*(u) — 
Z(u)]) under the condition pee [AK (u) = 1, allows to 
determine the weights Aa from the following system of 
equations called ordinary kriging system (normal 
equations with constraints): 


ie Age (u)C(ug — Ug) + y(u) = C(u — ua) 
E-A (u)=1 


where C (ug — Ug) and C(u — ug) are, respectively, the 


31 
a=1,....n on) 


covariance between the points ugand Ug, u and Ug. Huis 
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the Lagrange parameter associated with the restriction: 
Ari Age (u) = 1. 
The kriging system (31) presents only one solution if: 


i) The covariance function C (h) is positive-definite, 1.e.: 


Var{} Y- ha Z(Uq)} = zi 2a dade C(ug —_ ug) = 
0(32) 


11) There are not two completely redundant data, i.e, Ug # 
Ug if a +P. 


The corresponding minimum variance of the error, called 
the kriging variance is given by: 


doĝo = Var[Z(u) — Z*(u)] = Co — XLC? Ag C (ug — 
Ug) — (u) (33) 
where Cy = Var{Z(u)} = o°. 


Substituting the expression for its covariance C (h) = Co — 
y(h), the system (31) and the variance of, can be written 
as a function of the semivariographic modely(h). 


Therefore, unlike the more traditional linear estimators, 
kriging uses a system of weights that considers a specific 
model of spatial correlation, variable to the area A under 
study. Kriging provides not only a least squares estimate of 
the variable being studied, but also the variance error 
associated(D. Istok & A. Rautman, 1996) [27]. 


V. MATERIALS AND METHODS 
5.1 Database 


A local telecommunications company provided technical 
characteristics of broadcast stations and the received signal 
of the routes described. This area is the urban center of 
Belém/PA. The acquisition of vertical and tested measures 
of the buildings and homes, totaling approximately 4500 
points (between residents and buildings) was done by 
AUTOCADMAP and ORTOFOTO obtained with a plant 
scanned fromthe Company for Metropolitan 
Development and Administration of Belém - CODEM. 
Belém, capital of the state of Pará, belonging to the 
Metropolitan Mesoregion of Belém with an area of 
approximately 1 064,918 km?, located in northern Brazil, 
with latitude -01° 27' 21" and longitude of -48° 30' 16", 
altitude of 10 meters and distance 2 146 Km of Brasilia. Is 
known as "Metropolis of the Amazon", and one of the ten 
busiest and most attractive of Brazil. The city of Belem is 
considered the biggest of the equator line, is also classified 
as a capital with the best quality of life in Northern 
Brazil.Fig.3 shows the routes used in the measurement 
campaign. 
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Fig. 3: Sampling points for power measurement in the 
study area [5] 


oo) 





5.2 Methodology 
5.2.1 Analysis in Time Series 


For the statistical analysis of received power along the 
pathways under study, was used time series model with the 
use of transfer function for modeling multivariate data sets 
of received power primarily along the eleven previously 
mentioned pathways, considering as the response variable 
and the received power variable distance between the 
transmitter and receiver, the distance between the height of 
buildings and buildings as covariates. All analyzes were 
performed using programs developed with the routines of 
the statistical soft SAS(SAS/ETS 9.1 User’s Guide, 2004) 
[28], which through the subroutine proc arima held the 
adjustment of ARIMA models. This adjustment, which is 
performed iteratively, consists of three steps. The first is 
the identification of the model, where the observed data is 
transformed into a stationary series. The second step is to 
estimate the model in which the orders p and q are 
selected, and the corresponding parameters estimated. The 
third step is the prediction, in which the estimated model is 
used to predict future values of the time series considered. 


The Figs. 4 to 6 present the graphs of the series which will 
be analyzed with data collected in eleven ways of the 
measuring campaign. 
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Fig: 6: Height of the buildings (m) 


32k, for the 
Explanatory Variables - Identification of Time Series 

This 
generating the series, which filters (ARIMA models) and 


Adjustment of Univariate Models 


phase consists in determining which process 
their orders. The completion of the identification process, 
in addition to graphical analysis, needs in general the 
interpretations of the autocorrelation function and partial 
autocorrelation function. In this study, the identification of 
each series was conducted using the soft SAS. For the 
series received power was applied a difference to make it 
stationary. In all cases, the estimated parameters were 
significant autocorrelation and residues had no significant, 
a sign acceptable fit as shown in Table 1. As of now the 
response variable of received power will be denoted by Y4 
and the explanatory variables distance between buildings 
and height of buildings by X,,andX>,, respectively. 


From the analysis of the autocorrelations and partial 
autocorrelations preliminary models were adjusted for the 
series (p indicates the significance of the estimate); the 
results are shown in Table 2. In all cases, the estimated 
parameters were significant autocorrelations and residuals 
showed no significant signal adjustment acceptable for the 
model. 


Table 1: ARIMA model adjusted to the series input. 


P, > x? 


Series (variable) 
0.3946 
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Table 2: ARIMA model adjusted to the series input. 


Adjusted Model 
Varladle 
Yq 


Ya = Ya-1 — 


p<0,0001 


p<0,001 p<0,0171 


p<0,047 p<0,045 


Where d: is the distance index, Y,,X1q and X34 are the 
variables; a,4,@2q and azg are random errors, p is p- 
value. 


To identify the model transfer function suitable for a data 
set, one must consider the graph of the cross-correlation 
function sample. For the cross-correlation function be 
meaningful, the series of input and response should be pre- 
filtered. 


For pre-filtering the series of input and response 
appropriate to analyze the correlation, the procedure is as 
follows: 


1. Adjusting an ARIMA model to the series input so that 
the model residuals are white noise; 


2. Filter the host response to the same template used to 
input the serial:; 


3. Making the cross-correlation of the series of filtered 
response to the filtered input string to determine the 
relationship between the series; 


4. Interpret the cross-correlation graph in the same way a 
graph of the autocorrelation function. Indicators 
autoregressive s terms indicate the denominator and 
indicators moving averages indicate terms of the 
numerator. 


The graph of cross correlation pre-filtered with a transfer 
function numerator terms g and p in accordance with the 
denominator shows the same pattern after slags, such as 


0,91 Qid-1 + Aid 
p<0,0001 


Xia = 128,58 + 0,077 Xid-7 + Aq 
p<0,0151 


Xoqg = 14,870 + 0,076 X>4-2 — 0,068 X24-9 — 0,081 Xog_11 
p<0,034 p<0,0109 


< 0,063 Xəd-12 < 0,064 Xəd-14 — 0,066 Xoq-15 + A3zq 
p<0,038 





Arima(0, 1,1) 
Arima(1,0,0) 


Arima(6,0,0) 


the graph of the autocorrelation function of an ARMA 
process (p,g). This is the key to identify the transfer 
function. Such behavior is not guaranteed without pre- 
filtering, however. The ARIMA procedure automatically 
makes the pre-filtering when including the appropriate 
declarations in code soft SAS [28]. 


The adjusted model for the received signal power (Ya) 
includes explanatory variables X,, (Distance between 
buildings) and X5,(Height of building) and, according to 


the analysis of cross correlations and after a few attempts, 
the following transfer function model was specified: 


2 
WotW25 0 Xs 4 NOD) 


— 0 
Ya (1+81B) 


— (1-61B—6gB9) 14 


The Tables 3 and 4 show estimates of the model 
parameters of the transfer function obtained through a 
program of soft SAS and residual analysis for the model 
obtained, respectively. It is observed that statistics of 
cross-correlations with the waste input variable were not 
significant, i.e, the model transfer function provides a 
proper fit to the data. All parameters showed significant 
estimates, but the check of residual autocorrelations shows 
significant value in lag 1 (in bold) as shown in Table 4, 
this indicates that the residuals of this preliminary model 
are not white noises. That is, it is necessary to estimate 
parameters for the error process (N4) for this model. 


Table 3: Estimates and statistics of transfer function model obtained by iterative (SAS) 


Numerator) | - | 261 | 0.0000 


[Denominator a2) | 012130 | 269 | 00071 | 9 
251 


Numerator 2 
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Table 4: Residual analysis for the model 








[uaa e TR 


The equation of the model in notation B of a delay 
operator can be written as: 


0.0055 + 0.002878? x 
4 (1 +0.7799B — 0.1213B°)° 1° 
0.0299 


=F EN 
(1+0.78529B) 77+ 4 


(35) 


The Figs. 7 and 8 show the autocorrelation function (ACF) 
and a Partial autocorrelation function (PACF) for the 
residuals. It is clearly observed a high correlation value for 
lag 1 in Fig. 7, evidencing a high correlation between the 
residuals. This residual analysis can indicate possible 
missing terms in the model. 
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Fig. 7: Analysis for autocorrelation functions (ACF) of 
residuals (Na) 
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Fig: 8: Analysis for partial autocorrelation functions 
(PACF) of residuals (Na) 


5.2.2. Geostatistical analysis 


In the previous section, we estimated a model in time 
series with transfer function models in which the residues 
(Na) these models are not white noise. 
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| 6 | 240.31 | <.0001 | -0.498 | 0.006 | -0.024 | 0.003 | 0.014 | 0.028 | 


Cross correlations 


Note that the adjusted model was considered only the 
macro-localized features existing in the data of the residue 
of the received signal power, calculated by the time series 
model, it is not yet taken into account the influence that the 
data have on its neighbors, the small spatial scale. In other 
words, the residuals of this model are still present in two 
components: ¢'(x)+e”, is only œ” is distributed 
independently. In this case, the estimated residuals may 
still be contaminated by the effect of spatial dependence 
on small spatial scale.(Fischer & Nijkamp, 1992) [29]. 


5.2.2.1. Spatial Autocorrelation Diagnosis 


One of the ways to diagnose the presence of spatial effects 
in the data of the residue of the time series model is 
previously calculated by graphical analysis of the 
experimental semivariogram. The spatial inference is 
performed by kriging process which is based on the 
Theory (RVT). This 
identifies the spatial distribution of a variable is expressed 


Regionalized Variable theory 
by the sum of three components: one structural component 
having a constant mean or trend; one spatially correlated 
random component, also called regionalized variation; one 
spatially uncorrelated random component (residual error). 


The analysis of spatial variability of residuals in time 
series models, calculated by the equation, is carried out 
with the aid of a semivariogram. This is one of the most 
important steps of the geostatistical analysis, because the 
semivariogram model chosen represents the spatial 
correlation structure to be used in inferential procedures of 
kriging. The results presented in Fig. 9 shows the 
omnidirectional semivariogram (isotropic case) and its 
adjustment model. 
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Fig 9: Experimental variogram and theoretical of residues 


Whose parameters are the equation 36, is the nested type 
(double spherical), that is, a combination of two spherical 


models. 
150 150\2 500 
y(h) = 75 +40 (2 — 1,5 (=) ) +20 (Se - 
500\2 
1,5 (=) ) (36) 
where: 


a; = 150 andC; = 40 correspond to the range of parameters 
and input, respectively, of the first spherical model (71 (n)). 


a2 =20 andC = 500 correspond to the range of parameters 
and input, respectively, of the second spherical model 


(Y2(n)). 


Based on the structure defined as nested semivariogram 
(double spherical was performed a spatial inference spatial 
the kriging process, obtaining the map of the spatial 
distribution of power received through the program 
SURFER(AI-sudani, 2019; Bresnahan & Dickenson, n.d.) 
[30-31]. 

With equation 50 establishes the geostatistical modeling of 
residual (N4) acquired by the time series model (equation 
35). The model establishing calculating of the received 
power in the searched area is given by: 


—0,0055 + 0,00287B" 


Ya = 75 +X 
ý (1 + 0,7799B — 0,1213B°)° 1% 


0,0299 . ne 150 15 (2) fi 
(1 + 0,78529B)° 24"? 2h 20” Nh 


500 500\3 
20 (= -1,5 (#9) ) + e4(37) 
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where h is given in meters, 
eq: random error. 


Analyzing autocorrelation functions and partial 
autocorrelation of the residues model found (equation 37) 
of Figs. 10 and 11. One can notice a substantial decrease in 
the residual autocorrelation (eg) when compared to the 
residues obtained from the time series model (equation 
35). Note that there is a significant decrease in the value of 
the autocorrelation for lag 1 (Fig. 10) when compared with 
Fig. 8. Since non of the lags present significant spike, soon 
can be stated that the number of residues of the simulated 
model is stationary (eg is a white noise). 
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Fig 10:Analysis for autocorrelation functions (ACF) of 
residuals( Na) 
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Fig 11:Analysis for partial autocorrelation functions 
(PACF) of residuals (Na) 


5.2.2.2. Spatial Inference with Kriging 


Using the parameters of the semivariogram was held to 
spatial inference through the process of kriging, obtaining 
the spatial distribution map of the received signal power 
(dBm) (simulated model — equation 37) shown in Fig.12. 
Fig.13 shows the spatial distribution of color levels which 
provides information on the pattern of distribution of the 
received power (dBm) obtained in the measurement 
campaign. One can observe the potential of the 
methodology adopted, when comparing the maps indicate 
the spatial distribution of the received power (dBm) by the 
receiving unit. 
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Fig 12: Spatial distribution map of received power (dBm) 
for Model simulated 


Note that the kriging shown in Fig. 12 and 13, where one 
can observe the areas with greater or lesser value of receipt 
of received power, according to the gradient of colors that 
allows visual analysis faster and simpler of the study area. 
Note that there is some agreement between the profile 
displayed by the maps obtained with the predicted values 

with the simulated model (Fig. 12) and field data (Fig. 13). 


The spatial distribution of values shows the regions 
marked in red as higher levels of received power (dBm). 
The regions in green and blue are areas with lower signal 
intensity. As might be expected, it is observed higher 
levels of power to the vicinity of the base station and other 
areasnot far from the BTS. 
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Fig 13:Spatialdistributionmapofreceivedpower (dBm) 


for measurementcampaign 


Note also that much of the region that has a low 
levelsignal is located at great distances from BTS, 
however, in the lower left corner of the maps (which are 
located AvenidasNazaré and Gentil Bittencourt), there is a 
region of low signal intensity, which can be explained by 
the higher incidence of heights of buildings and also with 
tunnels formed by mango trees present on these two 
pathways. 


Fig. 14 the graph shows the response for the model and 
observed values of the response variable power received 
through simulated time series models with error correction 
using a geostatistical model. The confidence intervals of 
95% are indicated by the yellow shaded area. 
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Fig 14:Power of the signal received by the mobile station and estimated by theoretical models and simulated 


VI. COMPARATIVE ANALYSIS OF THE RESULTS Ibrahim-Parsons and Okumura-Hataand the response of 
After conducting the comparative analysis of the results the 
obtained in measurements with the prediction of simulated model time series with error correction 
theoretical models of Ikegami-Walfisch, Ibrahim-Parsons developed in this work, to eleven routes from the center of 
and Okumura-Hata and the simulated model. The the city of Belém/PA. Table 5 presents the values of the 
theoretical models consider when calculating the signal mean square error, the mean and standard deviation. 


attenuation parameters such as height of the transmitting Table 5: Comparison between the three theoretical models 


antennas of receptor, types of cities, the average height of and the measured value for the pathways involved in the 


the buildings, width of streets, operating frequency and measurement campaign 
types of urbanization (rural suburban and urban), and the 


dependence on the distance 


i l Mean Mean Standard 
The parameters used in the analysis of models had the Squared (dBm) Deviation 


following values: The receiving antenna height: h, = Measured | = |80. 7875 15.7682 
15m; the receiving antenna height: h; = 35m; the -89.6232 18.0632 
operating frequency: f = 877.44 MHz and average width 39.79 = 9.6217 
of the street: W = 22 m. 13.26 | -79.8502 | 11.0363 

For the qualitative analysis of the measures with 16.56 -76.2557 8.8080 


forecasts. In Figure 15 we present the experimental results 





and theoretical simulations performed by the models of 
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Fig. 15: Power of the signal received by the mobile station and estimated by theoretical models and simulated 


Observing Table 5, it is found that, among the models 
given above, what best describes the signal in the study 
area is the simulated model developed in series with error 
correction using geostatistics, which presented an root 
mean square error 0.33 dB and the standard deviation 
18.0632. However, between theoretical models, which is 
closer to the measurements of the model of Ibrahim- 
Parsons which showed a root mean square error of 13:26 
dB and the standard deviation 11.0363. The model of 
Okumura-Hata presented the worst results. 


In order to make a study of the performance of the 
proposed model, in order to check its validity, analysis was 
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performed using data collected in another way of 
measuring campaign (RuaConselheiro Furtado), which 
was not part of the data processing for obtaining the 
coefficients entered into the equation (37). Table 6 shows 
the results of experimental simulations performed by the 
theoretical model of Ikegami-Walfisch, Ibrahim-Parsons 
and Okumura-Hataand the response of the simulated 
model obtained for the RuaConselheiro Furtado.Fig. 16 
shows the comparative graph of received signal strength 
versus distance from the base station to RuaConselheiro 
Furtado. 


Simulated Model 


Measure 
Iikegami-Walfisch Model 
Iibrahim-Parsons Model 
— =— = Okumura-Hata Model 
1500 1600 1700 
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Fig 16: Power ofthesignalreceivedbythe mobile stationandestimatedbytheoretical models andsimulated 


for Rua Conselheiro Furtado. 
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Table 6: Comparison between the results obtained for the 
three theoretical models, simulated model and the 


measured value for RuaConselheiro Furtado 


Mean Mean Standard 
|_| saurea | can | evtton 
(feos | 90856 
Coa |- os 


Measured 
Proposed 


Ikegami- 37.73 | - | 3.8444 
Ibrahim- 14.24 P| 3.1487 
Okumura-Hata 18.40 Lo= | 2.7320 


VII. CONCLUSIONS 





In this work was presented a time series model with error- 
correlation through geostatic, which consist a forecast of 
received power along the eleven pathways localized in the 
urban center of the city of Belém/PA. There has been 
shown a study of possible relationships existing between 
the received power signal, the height of buildings, and also 
the distance between the buildings. Transfer function 
models have been used for assessment of the received 
power signal in time series. As the models of classical 
statistical ignore the possible correlations between 
neighboring samples, not exploring in a satisfactory 
manner, the relationships that may exist between the 
sampling units. The error correction model time series was 
performed using a geostatistical model that considered the 
georeferencing data, which allowed the identification of 
these interaction effects in the same space, using kriging 
process. 


The proposed model showed a good result with mean 
Square error in the order 0.33 dB compared to the 
measured signal, considering the data of the eleven ways 
of the measurement campaign; while for models of 
Ibrahim Parsons and Okumura-Hata this error was around 
13:26 and 16:56 dB, respectively. 


The results show that the adjusted models retained the 
same characteristics of the original signal. Moreover, this 
methodology allows for individualized assessment of all 
points in the region considered, from the knowledge of 
their geographical coordinates, and not only the 
demonstration of generic values as occurs in traditional 
preparation of propagation models Thus, we obtained 
estimates of statistics, graphs and maps of dispersion and 
surface to describe the behavior of spatially variable 
received signal strength (dBm) in the center of the city of 
Belém/Pa. 


Therefore, the study conducted through statistical analysis 
in time series with transfer function models and the study 
of spatial variability of the variables of interest allowed the 
construction of a model that allowed us to identify by the 
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spatial plant the measurements of received power (dBm) 
and the gradient of the lines of iso-values, the vectors for 
better signal reception issued by BTS, identifying 
homogeneous areas, as well as those where users are 
harmed or benefited from the service of the local operator. 
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